Introduction


The vascular endothelial growth factor (VEGF) family of cytokines promotes vascularization, tumorigenesis, and
metastasis in many cancers (1). Although prostate cancer is vasculature- and VEGF-dependent, Phase III
clinical trials have failed to show benefit for the anti-VEGF antibody bevacizumab (2). Some recent Phase II
clinical trials for castration-resistant prostate cancer have revived hope in this treatment (3, 4) and underscored
the need to better understand these drugs and their targets. To improve this understanding, we will create
prostate-cancer-specific computational models of VEGF and its receptors. These models will be used to simulate
therapies that target the pathway. The therapies to be tested include anti-ligand agents such as bevacizumab, as
well as anti-receptor agents and small molecules such as tyrosine kinase inhibitors (5). In this way, we can build
on both the successes and the failures of anti-VEGF trials to date to develop more effective therapies for
prostate cancer.


Body 

In  this  section,  we  will  go  through  the  five  tasks  outlined  in  the  statement  of  work  and  describe  the  progress 
made  towards  accomplishing  these  tasks. 

Task 1. Collate the publicly available prostate cancer gene expression datasets (months 1-36)

1a. Collate the currently available datasets (months 1-3)

1b. Develop a monitoring policy for new datasets (months 1-3)

1c. Incorporate new datasets as they become available (months 4-36)

Progress to date for Task 1: We have identified the publicly available datasets for prostate cancer gene
expression and assembled them into working databases for our needs. Because these high-throughput
studies use different platforms, we keep the platforms separate and perform comparative analyses only on
datasets from the same platform. For example, we do not mix RNA-Seq datasets with microarray datasets.
However, we do compare the outcomes of these analyses to determine whether the qualitative insights
are consistent from set to set. An example of this methodology is available in our published breast cancer
study (6). For the prostate cancer study and the results presented here, we are working
primarily with 176 samples from the TCGA study, quantified using RNA-Seq. As defined in the statement of
work, we will continue to collect and analyze datasets as the project progresses.


Task  2.  Analyze  the  gene  expression  datasets  (months  1-12) 

2a.  Bioinformatic  analysis  of  angiogenesis  genes  for  prostate  cancer  (months  1-9) 

2b.  Write  manuscript  describing  findings  (months  9-12) 

Progress  to  date  for  Task  2:  We  have  analyzed  the  collated  prostate  cancer  data. 

First, we performed principal component analysis (PCA) of 39 VEGF and semaphorin ligands and receptors in
prostate tumor samples to find patterns of co-regulation among these genes (Fig. 1), with comparisons to
breast and kidney cancer. Four out of five VEGF receptors (FLT1, KDR, FLT4, and NRP2) had large negative
loadings on the first prostate principal component, indicating that expression of these genes was highly
correlated in the prostate cancer dataset. This pattern was also noted in the breast and kidney cancer datasets,
suggesting that correlated expression of multiple VEGF receptors may be common to multiple types of tumors.

The second principal component in the prostate dataset was notable for its large negative loadings for
SEMA3A, SEMA3C, SEMA3D, and SEMA3E, together with a large positive loading for NRP1 (Fig. 1). NRP1 promotes
VEGF signaling, while class 3 semaphorins typically inhibit angiogenesis; thus, samples with high positive
scores on the second principal component (high NRP1, low class 3 semaphorin expression) likely possess a
pro-angiogenic signature.
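
As a rough illustration of how loadings like those in Fig. 1 are obtained, the sketch below centers a small expression matrix, forms its covariance matrix, and extracts the first principal component by power iteration. The expression values are invented for illustration (only the gene names come from the analysis above), the function names are hypothetical, and this is not the pipeline used for the results reported here. Note that the sign of a principal component is arbitrary, so a set of large negative loadings carries the same information as the same set with all signs flipped.

```python
import math

def center_columns(rows):
    """Subtract each gene's mean so every column has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix (genes x genes) of centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iterations=200):
    """Leading eigenvector of cov by power iteration, scaled to unit length."""
    p = len(cov)
    # Uniform start vector; a random start is safer for arbitrary data.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Toy data: rows are tumor samples, columns are genes (values are made up).
genes = ["FLT1", "KDR", "SEMA3A"]
expression = [
    [1.0, 1.2, 3.0],
    [2.0, 2.1, 2.9],
    [3.0, 2.9, 3.1],
    [4.0, 4.2, 3.0],
    [5.0, 4.8, 2.9],
]

loadings = first_component(covariance(center_columns(expression)))
for gene, loading in zip(genes, loadings):
    print(f"{gene}: {loading:+.2f}")
```

In this toy matrix FLT1 and KDR rise and fall together, so they receive large loadings of the same sign on the first component, while SEMA3A, which barely varies, receives a loading near zero.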

We used K-means clustering to determine groups of prostate tumor samples with distinct patterns of VEGF
and semaphorin expression. Consensus methods, in which clustering was performed repeatedly on random subsets
of the data, showed that three clusters gave the most consistent separation (Fig. 2). These three clusters were
primarily distinguished by their scores on principal component 1 (PC1), as shown in the clustering heatmap
(Fig. 3). Thus, one cluster has high expression of the FLT1/KDR/FLT4/NRP2 signature identified by principal
component analysis, while another cluster has low expression of this signature.
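
The consensus idea described above can be sketched as follows: cluster many random subsets of the samples, record how often each pair of samples lands in the same cluster, and prefer the number of clusters for which those pairwise frequencies are close to 0 or 1. This is a minimal sketch under several assumptions that are not taken from the report: samples are reduced to a single made-up PC1-like score, K-means uses a deterministic farthest-point initialization, and stability is summarized by the proportion of ambiguous pairs (pairs co-clustered in between 10% and 90% of the resamples). All function names are hypothetical.

```python
import random

def kmeans_1d(values, k, iterations=50):
    """Lloyd's algorithm on scalar scores with farthest-point initialization."""
    centers = [min(values)]
    while len(centers) < k:
        centers.append(max(values, key=lambda v: min(abs(v - c) for c in centers)))
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = sum(members) / len(members)
    return [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]

def consensus_pac(values, k, resamples=30, fraction=0.8, seed=0, low=0.1, high=0.9):
    """Proportion of sample pairs whose co-clustering frequency is ambiguous."""
    rng = random.Random(seed)
    n = len(values)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(k, int(fraction * n))
    for _ in range(resamples):
        idx = rng.sample(range(n), size)
        labels = kmeans_1d([values[i] for i in idx], k)
        for a in range(size):
            for b in range(a + 1, size):
                i, j = sorted((idx[a], idx[b]))
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    ratios = [together[i][j] / sampled[i][j]
              for i in range(n) for j in range(i + 1, n) if sampled[i][j]]
    return sum(low < r < high for r in ratios) / len(ratios)

# Made-up PC1-like scores forming three well-separated groups.
scores = [-5.1, -4.9, -5.0, -4.8, -5.2,
          0.1, -0.1, 0.0, 0.2, -0.2,
          5.0, 4.9, 5.1, 5.2, 4.8]
pac_by_k = {k: consensus_pac(scores, k) for k in (2, 3, 4)}
for k, pac in pac_by_k.items():
    print(f"k={k}: proportion of ambiguous pairs = {pac:.2f}")
```

For these toy scores, k = 3 recovers the same three groups in every resample, so no pair is ambiguous. With real expression data the separation is far less clean, and the choice of stability criterion matters.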

We expanded the list of genes of interest by augmenting the VEGF and semaphorin families with other
angiogenesis-related growth factors and receptors. Using this expanded dataset, we asked
whether a gene expression-based indicator could be found in prostate cancer that correlated with significant
gene expression changes across the whole angiogenesis network. To that end, we screened all the genes in the
RNA-Seq dataset for those whose expression had a bimodal distribution. Of the 11 genes found to have
bimodal distributions, one, ERG, stood out: samples could be classified by ERG status with high accuracy (93%)
by a partial least squares discriminant analysis (PLS-DA) model built using the 85-gene dataset (Fig. 4). ERG+
samples had higher expression of PDGFA and PDGFC, and lower expression of SEMA3E and SEMA3F. High
expression of ERG is correlated with fusion of the TMPRSS2 and ERG genes, and is associated with poorer
prognosis in prostate cancer patients (7).
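
The report does not state which statistic was used to screen for bimodal expression. One simple heuristic is the moment-based bimodality coefficient, (skewness^2 + 1) / kurtosis, which exceeds 5/9 (its value for a uniform distribution) when a distribution looks bimodal. The sketch below applies it to two genes; the expression values are invented, the function names are hypothetical, and the choice of statistic is an assumption rather than the method used for the results above.

```python
def bimodality_coefficient(values):
    """(skewness^2 + 1) / kurtosis, computed from population moments.

    Values above 5/9 hint at bimodality. This is a heuristic, not a formal
    test, and it is undefined for constant data (zero variance).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # non-excess kurtosis (3 for a normal distribution)
    return (skewness ** 2 + 1) / kurtosis

def screen_bimodal(expression_by_gene, threshold=5 / 9):
    """Return the genes whose bimodality coefficient exceeds the threshold."""
    return [gene for gene, values in expression_by_gene.items()
            if bimodality_coefficient(values) > threshold]

# Made-up expression values: ERG has two separated groups, KDR has one peak.
toy_expression = {
    "ERG": [0.2, 0.3, 0.1, 0.25, 0.15, 8.1, 7.9, 8.3, 8.0, 8.2],
    "KDR": [4.0, 5.0, 5.0, 5.0, 6.0, 5.0, 5.0, 4.0, 6.0, 5.0],
}
candidates = screen_bimodal(toy_expression)
print(candidates)
```

A gene that passes such a screen can then be thresholded into "high" and "low" groups, which gives the class labels that a discriminant model such as PLS-DA tries to predict from the other genes.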

We  are  preparing  a  manuscript  based  on  these  findings  and  continuing  to  analyze  the  data  to  identify 
additional  insights. 


Task  3.  Build  canonical  prostate  cancer  VEGF  transport  model  (months  1-15) 

3a.  Collate  anatomical  and  other  prostate-specific  parameters  (months  1-12) 

3b.  Build  computational  model  of  prostate  cancer  within  human  body  (months  3-6) 

3c.  Simulate  and  analyze  prostate  cancer  with  representative  (average)  gene  expression  (months  6-12) 

3d.  Write  manuscript  based  on  canonical  (average)  prostate  cancer  VEGF  model  (months  12-15) 

3e.  Post  code  for  model  on  public  model  database  sites  (after  manuscript  acceptance) 

Progress  to  date  for  Task  3:  We  have  built  the  transport  model  for  prostate  cancer.  The  efficacy  of  drugs  that 
target  proteins  is  dependent  on  the  protein  interaction  network  of  the  target.  There  are  several  VEGF  isoforms 
and  multiple  VEGF  receptors  on  endothelial  cells  (8).  The  overall  vascular  response  to  the  drug  is  not  obvious 
because  of  the  multiple  competing  ligands  and  receptors.  Only  by  including  all  of  these  in  a  computational 
simulation  can  we  find  the  impact  of  the  drug,  and  how  that  impact  changes  depending  on  the  variable 
expression  of  the  competing  ligands  and  receptors. 

To  show  the  utility  of  this  method,  and  to  show  preliminary  data  for  Task  4,  we  simulated  the  transport  and 
receptor  binding  of  VEGFA  in  compartmental  models  representing  a  population  of  prostate  cancer  patients. 
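
To make the idea of a compartmental binding model concrete, the sketch below integrates a deliberately small, single-compartment system: ligand is secreted at a rate standing in for VEGFA expression, binds one receptor type reversibly, is lost from the compartment by first-order transport, and receptors are inserted and internalized. Every parameter value, the single receptor type, and the function name are assumptions chosen for illustration in arbitrary units; the model described in this report has multiple compartments, isoforms, and receptors.

```python
def simulate_compartment(secretion, t_end=600.0, dt=0.01,
                         k_on=1.0, k_off=0.1, k_int=0.05,
                         k_out=0.2, receptor_insertion=0.05):
    """Forward-Euler integration of free ligand L, free receptor R, complex C.

    dL/dt = secretion - k_on*L*R + k_off*C - k_out*L
    dR/dt = receptor_insertion - k_int*R - k_on*L*R + k_off*C
    dC/dt = k_on*L*R - k_off*C - k_int*C
    """
    # Start with no ligand and receptors at their ligand-free steady state.
    L, R, C = 0.0, receptor_insertion / k_int, 0.0
    for _ in range(int(round(t_end / dt))):
        binding = k_on * L * R - k_off * C  # net rate of complex formation
        dL = secretion - binding - k_out * L
        dR = receptor_insertion - k_int * R - binding
        dC = binding - k_int * C
        L, R, C = L + dt * dL, R + dt * dR, C + dt * dC
    return L, R, C

# Two hypothetical tumors that differ only in ligand secretion rate.
for label, rate in (("low expression", 0.05), ("high expression", 0.2)):
    free_ligand, free_receptor, bound = simulate_compartment(rate)
    print(f"{label}: free ligand={free_ligand:.3f}, bound receptor={bound:.3f}")
```

At steady state the total receptor level equals receptor_insertion / k_int and secretion is balanced by transport plus internalization of bound ligand, which gives two quick checks on the integration. Replacing the single secretion rate with per-patient expression values is the step that turns one canonical model into a population of patient-specific models.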

The simulations produce several metrics that relate to angiogenesis
signaling. Each dot on the graphs represents a different individual's tumor. Fig. 5 shows the effect of each of four
gene expression inputs on various simulated model features. VEGFA expression (left column) has the
strongest association with the features related to VEGF concentrations and receptor binding. Surprisingly, the
correlations between VEGFA expression and total VEGFR1/VEGFR2 binding are higher than the correlations
between VEGFA expression and plasma total VEGF or tumor free VEGF. This insight is directly relevant
to assessing plasma VEGF as a potential biomarker for diagnosis, prognosis, and evaluation of
treatment response.
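
Comparisons of this kind come down to computing a correlation coefficient between one expression input and each simulated output across the virtual patients. The sketch below shows the calculation with Pearson's r. The three lists are placeholder numbers chosen only to exercise the function; they are not simulation outputs and do not reproduce Fig. 5, and the report does not state which correlation measure was used.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Placeholder values for five hypothetical virtual patients (arbitrary units).
vegfa_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
receptor_binding = [0.9, 2.1, 2.9, 4.2, 4.9]
plasma_vegf = [2.0, 1.0, 4.0, 3.0, 5.0]

r_binding = pearson(vegfa_expression, receptor_binding)
r_plasma = pearson(vegfa_expression, plasma_vegf)
print(f"expression vs. receptor binding: r = {r_binding:.2f}")
print(f"expression vs. plasma VEGF:      r = {r_plasma:.2f}")
```

Repeating this for every input and output pair yields a grid of coefficients analogous to the panels of a figure like Fig. 5. A rank-based measure such as Spearman's correlation may be preferable when the relationship is monotonic but nonlinear.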


Task  4.  Build  virtual  patient  bank  (months  12-36) 

4a.  Incorporate  available  patient-specific  parameters  into  computational  models  (months  12-18) 

4b.  Monitor  new  datasets  (see  #1c  above)  and  add  virtual  patients  as  appropriate  (months  18-36) 

Progress  on  Task  4:  As  per  the  statement  of  work,  the  substantial  work  on  Task  4  will  take  place  in  Years  2 
and  3  of  the  grant.  Some  preliminary  work  for  Task  4  was  described  in  Task  3  above. 




Task  5.  Test  anti-VEGF  pathway  therapies  against  the  virtual  patient  population  (months  18-36) 

5a.  Incorporate  anti-VEGF  pathway  therapies  into  the  canonical  code  (months  18-21) 

5b.  Test  the  anti-VEGF  pathway  therapies  against  the  population  of  virtual  patient  models  (months  21-36) 

5c. Analyze the data from these results, in particular a bioinformatic analysis to reduce the number of possible
therapies and identify contiguous therapy-responsive subgroups of patients with accompanying biomarkers
(months 21-36)

5d.  Write  manuscript  based  on  virtual  patient-specific  predictions  (months  33-36) 

5e.  Post  code  for  model  on  public  model  database  sites  (after  manuscript  acceptance) 

Progress  on  Task  5:  As  per  the  statement  of  work,  work  on  Task  5  will  take  place  in  Years  2  and  3  of  the 
grant. 


Key  Research  Accomplishments 

•  Development  of  prostate-specific  computational  model  of  VEGF  transport 

•  Creation  and  simulation  of  first-draft  patient-specific  computational  models 

•  Identification  of  possible  biomarkers  of  multi-gene  growth  factor  networks  in  prostate  cancer 


Reportable  Outcomes 

None  yet.  We  anticipate  publications  to  report  by  the  end  of  Year  2. 


Conclusion 

As  described  above,  we  have  collated  prostate  cancer  gene  expression  data,  analyzed  the  data  using 
bioinformatic  techniques,  and  created  new  computational  models  to  simulate  prostate  cancer,  based  on  the 
individualized  gene  expression  data.  This  progress  will  continue,  and  we  will  be  able  to  develop  models  of 
therapies  including  bevacizumab  and  other  drugs,  in  order  to  design  improved  therapeutic  approaches  (both 
for  individuals  and  for  the  population). 

So what: Without the computational model, the extensive knowledge of mechanisms that has resulted from
many researchers studying these pathways could not be used in combination with the high-throughput data. In
short, our models allow us to place the individualized data in its correct context: the complex molecular
interaction networks inside tumors.

Figure 1. Principal component analysis for prostate cancer. These loading plots show the key genes that
make up the first and second principal components (PCs) of these data sets. The breast and kidney cancer
plots are included for comparison; it is important to place prostate cancer in context with other cancers in order
to interpret how the success or failure of VEGF-targeting therapeutics in those cancers can provide
guideposts for prostate cancer therapeutics.

Figure 2. K-means clustering of prostate cancer data shows that creating three subgroups provides the
cleanest and most consistent separation of data.

Figure 3. Clustering of prostate cancer data shows the three clusters that emerge from the gene expression
of the VEGF family of growth factors and their receptors, and of the semaphorin family (which competes with
VEGFs for binding the key neuropilin receptors). Red indicates higher expression; green indicates lower
expression. Values for three principal components (PCs) are also shown.

Figure 4. ERG expression is associated with differences in expression of key angiogenesis-related proteins.
Left, ERG expression is bimodal within the prostate cancer population. Right, the principal components of
angiogenesis-related gene expression are highly predictive of ERG expression.

Figure 5. Whole-body simulations of VEGF distribution in prostate cancer. Each dot represents one
individual tumor, each with a unique signature of gene expression across the multiple growth factors and
receptors of the VEGF family. These simulations predict multiple outputs related to VEGF signaling, such as
ligated receptors.




